Grapheme-to-Phoneme Conversion in Mandarin Chinese Text-to-Speech System

Authors

  • Hongwei Ding
  • Oliver Jokisch
Abstract

We present a lexicon-based model for segmenting Chinese text into dictionary entries and for providing pronunciations for these words. The approach combines a matching algorithm with several heuristic rules to resolve ambiguities. It achieves an overall accuracy above 95%, which shows it to be an effective solution to grapheme-to-phoneme conversion for Mandarin Chinese.

Introduction

Written Chinese text consists of strings of characters with no blanks to delimit words. The first step towards word-based indexing is therefore to break a sequence of characters into words, a process called word segmentation. Nor can the word-segmentation problem be bypassed: many Chinese characters are homographs whose pronunciation depends on the word they belong to.

The Problem of Word Segmentation

Word identification is difficult for several reasons. First, almost all characters are free morphemes: they can stand alone as one-character words, and they can also join other characters to form multi-character words. Second, compounding is the predominant word-formation device in modern Chinese, so it is hard to tell whether a low-frequency compound is a word or a phrase. Third, the same pool of characters is also used to construct proper names, which complicates personal name identification [2].

Strategies in Word Segmentation

Existing methods for this problem can be classified into (1) purely statistical approaches [1]; (2) heuristic rule-based methods [2]; and (3) statistical approaches that incorporate lexical knowledge [3]. Many statistical methods rely on a large pre-segmented text corpus for their analysis. The simplest and most effective method is the lexicon-based algorithm with supplementary rules. Our TTS system DRESS adopts this method, modified to suit the system. The paper first introduces our synthesis system. It then presents the solution for word identification and phonetic conversion. Finally, it points out possibilities for future research.

Synthesis System

The Mandarin Chinese text-to-speech system developed at TU Dresden is a syllable-based waveform-concatenation synthesizer. It consists of text analysis and acoustic synthesis. The acoustic synthesis already achieves high naturalness. A syllable-based inventory takes cross-syllable co-articulation into consideration [4]. A neural network is responsible for learning and modifying duration and intonation [5]. Because grapheme-to-phoneme conversion had not yet been solved, word boundaries had to be inserted manually during synthesis. This paper presents a solution to word segmentation that allows the whole text-to-speech system to operate automatically.

Word Segmentation

The word segmentation stage comprises a maximum matching algorithm that uses a word lexicon, several ambiguity resolution rules, and procedures for handling time and numeral expressions and for identifying personal names.

[Figure 1: Grapheme-phoneme conversion. Diagram labels: Input text (a string of Chinese characters); Word Identification; Word Lexicon; Ambiguity Resolution Rules; Time & Numerals Expressions; Name Identity; Segmented Words in Characters; Grapheme-Phoneme Conversion; Word Lexicon with Phonetic Transcription; Phonetic Sequences with Tones; Prosodic Generation]

Maximum Path-Matching

The lexicon-based approach to word identification relies on matching: the basic strategy is to match the input character string against a large set of entries stored in a pre-compiled lexicon in order to find all (or some) of the possible segmentations. A variant of maximal matching proposed in [2] takes the most plausible segmentation to be the three-word chunk with maximal length. This algorithm is adopted in our system.

CFA/DAGA'04, Strasbourg, 22-25/03/2004
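As a rough illustration of the three-word-chunk maximal matching described under Maximum Path-Matching, followed by a lexicon lookup for pronunciation, the sketch below may help. It is not the DRESS implementation: the toy `LEXICON`, the function names (`candidates`, `chunks_from`, `segment`, `to_pinyin`), the tie-break that prefers fewer words, and the `<unk>` fallback for unlisted characters are all assumptions made for this example. The source resolves ambiguities with heuristic rules that are not detailed in this excerpt.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> pinyin with tone numbers (5 = neutral tone).
LEXICON: Dict[str, str] = {
    "研究": "yan2 jiu1", "研究生": "yan2 jiu1 sheng1",
    "生命": "sheng1 ming4", "科学": "ke1 xue2",
    "生命科学": "sheng1 ming4 ke1 xue2",
    "的": "de5", "人": "ren2", "我": "wo3", "去": "qu4",
    "银行": "yin2 hang2", "行": "xing2", "不": "bu4", "不行": "bu4 xing2",
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def candidates(text: str, start: int) -> List[str]:
    """Words that may begin at `start`: the single character (always allowed,
    even if unlisted) plus every longer lexicon entry that matches."""
    words = [text[start:start + 1]]
    for length in range(2, MAX_WORD_LEN + 1):
        piece = text[start:start + length]
        if len(piece) == length and piece in LEXICON:
            words.append(piece)
    return words


def chunks_from(text: str, start: int, depth: int = 3) -> List[List[str]]:
    """All chunks of up to `depth` consecutive words beginning at `start`."""
    if depth == 0 or start >= len(text):
        return [[]]
    return [[word] + rest
            for word in candidates(text, start)
            for rest in chunks_from(text, start + len(word), depth - 1)]


def segment(text: str) -> List[str]:
    """Emit the first word of the longest three-word chunk, then advance.
    Assumption: equal-length chunks are resolved in favour of fewer words."""
    words, pos = [], 0
    while pos < len(text):
        best = max(chunks_from(text, pos),
                   key=lambda chunk: (sum(len(w) for w in chunk), -len(chunk)))
        words.append(best[0])
        pos += len(best[0])
    return words


def to_pinyin(text: str) -> str:
    """Look up each segmented word; unlisted words map to '<unk>'."""
    return " ".join(LEXICON.get(word, "<unk>") for word in segment(text))


if __name__ == "__main__":
    print(segment("研究生命科学的人"))  # ['研究', '生命科学', '的', '人']
    print(to_pinyin("我去银行"))        # wo3 qu4 yin2 hang2
    print(to_pinyin("不行"))            # bu4 xing2
```

In the first example, a purely greedy longest-first match would take 研究生 as the first word, whereas looking ahead over three words selects 研究. The last two examples show why segmentation cannot be bypassed: the homograph 行 is read hang2 inside 银行 but xing2 inside 不行.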


Similar resources

Grapheme-to-phoneme conversion based on TBL algorithm in Mandarin TTS system

Grapheme-to-phoneme (G2P) conversion is an important component in a Text-to-Speech (TTS) system. The difficulty in Chinese G2P conversion is to pick out one correct pronunciation from several candidates according to the context information. By evaluating the distribution of polyphones in a corpus with manually corrected pinyin transcriptions, this paper pointed out that the overall error rate o...


An Efficient Way to Learn Rules for Grapheme-to-Phoneme Conversion in Chinese

Grapheme-to-phoneme (G2P) conversion is a very important component in a Text-to-Speech (TTS) system. Determining the pronunciation of polyphone characters is the main problem that the G2P component in a Mandarin TTS system faces. By studying the distribution of polyphones and their characteristics in a large text corpus with corrected pinyin transcriptions, this paper points out that correct G2...


Grapheme-to-phoneme conversion for Chinese text-to-speech

This paper reports a study of grapheme-to-phoneme (G2P) conversion for a Chinese text-to-speech (TTS) system. As Chinese is a syllabic language, the syllable is commonly adopted as the phonetic unit in TTS and is represented by pinyin, the standard Chinese romanization. Chinese G2P conversion must find the correct pinyin for polyphonic graphemes in the input text. In this paper, a complete G2P fram...


Rule-based Korean Grapheme to Phoneme Conversion Using Sound Patterns

Grapheme-to-phoneme conversion plays an important role in text-to-speech applications and other fields of computational linguistics. Although Korean uses a phonemic writing system, it still requires grapheme-to-phoneme conversion for speech synthesis because the Korean writing system does not always reflect actual pronunciation. This paper describes a grapheme-to-phoneme conversion method based o...


Grapheme to phoneme conversion: an Arabic dialect case

We aim to develop a speech-to-speech translation system between Modern Standard Arabic and the Algiers dialect. Such a system must include a text-to-speech module, which in turn must include a grapheme-to-phoneme converter. The Algiers dialect is an Arabic dialect affected by most of the problems that Modern Standard Arabic poses for NLP. Furthermore, it could be considered an under-resourced language becau...




Publication date: 2009